Skip to content

Conversation

@yudataguy
Copy link
Collaborator

Hibernation Mode

Description

Added hibernation mode support to ModeManager: state persistence, wake-window handling, dormant sleep entry via PicoSleep, and safety rollbacks with a new WARNING_HI HibernationEntryFailed event if dormant entry fails after an OK response.
Ensured command/telemetry correctness: ENTER_HIBERNATION responds before reboot, load-state now defaults on read/size errors, hibernation counters roll back on failures, and load switches are restored consistently across modes.

Related Issues/Tickets

#106

This PR should be merged at the end of mode manager's PRs. Although this may not impact others since it only connects w/ safe mode manually.

How Has This Been Tested?

The test may not be complete due to the time requirement.

  • Unit tests
  • Integration tests
  • Z Tests
  • Manual testing (describe steps)

Screenshots / Recordings (if applicable)

Checklist

  • Written detailed sdd with requirements, channels, ports, commands, telemetry defined and correctly formatted and spelled
  • Have written relevant integration tests and have documented them in the sdd
  • Have done a code review with
  • Have tested this PR on every supported board with correct board definitions

Further Notes / Considerations

@yudataguy yudataguy requested a review from Copilot November 30, 2025 01:56
Copilot finished reviewing on behalf of yudataguy November 30, 2025 02:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds hibernation mode support to the ModeManager component, implementing ultra-low-power dormant sleep with RTC alarm wake, wake window handling, state persistence, and safety mechanisms (counter rollback, HibernationEntryFailed event). The implementation includes comprehensive integration tests and updates the mode enum values to support sequential transitions (HIBERNATION_MODE=0, SAFE_MODE=1, NORMAL=2, PAYLOAD_MODE=3).

Key Changes:

  • Hibernation mode with RP2350 dormant sleep via PicoSleep wrapper, configurable sleep/wake durations, and automatic wake window countdown
  • Command response timing fix: ENTER_HIBERNATION sends OK before reboot to prevent timeout
  • Counter rollback on dormant entry failure with WARNING_HI HibernationEntryFailed event notification
  • Payload mode implementation with separate load switch control (switches 6-7)
  • Enhanced test infrastructure with --run-reboot flag for hardware-only reboot tests

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 17 comments.

Show a summary per file
File Description
README.md Adds documentation for running single integration test files using TEST parameter
Makefile Implements TEST parameter for selective test execution with filename pattern matching
FprimeZephyrReference/test/long/mode_manager_test.py Updates enum values and test expectations for reordered mode constants
FprimeZephyrReference/test/int/payload_mode_test.py New comprehensive payload mode integration tests covering entry/exit, validation, and persistence
FprimeZephyrReference/test/int/hibernation_mode_test.py New hibernation integration tests with dormant failure handling and reboot tests
FprimeZephyrReference/test/int/conftest.py Adds pytest --run-reboot flag configuration and marker for hardware reboot tests
FprimeZephyrReference/ReferenceDeployment/Top/ReferenceDeploymentPackets.fppi Adds hibernation telemetry channels to Health packet
FprimeZephyrReference/Components/ModeManager/docs/sdd.md Comprehensive documentation update with requirements, sequences, state machine, and integration test coverage
FprimeZephyrReference/Components/ModeManager/PicoSleep.hpp Defines PicoSleep wrapper interface for RP2350 dormant mode
FprimeZephyrReference/Components/ModeManager/PicoSleep.cpp Implements dormant mode placeholder (sys_reboot) with native/sim failure path
FprimeZephyrReference/Components/ModeManager/ModeManager.hpp Adds hibernation state variables, wake window tracking, and expanded PersistentState structure
FprimeZephyrReference/Components/ModeManager/ModeManager.fpp Defines hibernation commands, events, and telemetry channels in FPP
FprimeZephyrReference/Components/ModeManager/ModeManager.cpp Implements hibernation entry/exit logic, wake window handling, counter management, and payload mode transitions
FprimeZephyrReference/Components/ModeManager/CMakeLists.txt Adds PicoSleep.cpp to build

this->log_ACTIVITY_HI_EnteringPayloadMode(reasonStr);
} else {
// NORMAL mode - ensure components are turned on
this->turnOnComponents();
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After restoring NORMAL mode from persistent storage (line 264), turnOnComponents() is called which turns on face switches (0-5). However, the code doesn't log any event to indicate this state restoration occurred, unlike SAFE_MODE (line 252-253) and PAYLOAD_MODE (line 260-261) which log their restoration. This inconsistency means operators won't see any event confirming NORMAL mode restoration after a reboot, making it harder to track system state changes. Consider adding a log event similar to the other modes.

Suggested change
this->turnOnComponents();
this->turnOnComponents();
// Log that we're restoring normal mode
Fw::LogStringArg reasonStr("State restored from persistent storage");
this->log_ACTIVITY_HI_EnteringNormalMode(reasonStr);

Copilot uses AI. Check for mistakes.
*.py) TARGET="FprimeZephyrReference/test/int/$(TEST)" ;; \
*) TARGET="FprimeZephyrReference/test/int/$(TEST).py" ;; \
esac; \
[ -e "$$TARGET" ] || { echo "Specified test file $$TARGET not found"; exit 1; }; \
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error check [ -e "$$TARGET" ] uses the -e test which checks if the file exists, but this will succeed for directories too. If someone passes TEST=int, it will find the directory FprimeZephyrReference/test/int and incorrectly proceed to run pytest on the directory instead of failing with "file not found". Change to [ -f "$$TARGET" ] to specifically check for a regular file, or adjust the logic to handle directories explicitly.

Suggested change
[ -e "$$TARGET" ] || { echo "Specified test file $$TARGET not found"; exit 1; }; \
[ -f "$$TARGET" ] || { echo "Specified test file $$TARGET not found"; exit 1; }; \

Copilot uses AI. Check for mistakes.
@yudataguy yudataguy closed this Nov 30, 2025
@yudataguy yudataguy reopened this Nov 30, 2025
@yudataguy yudataguy requested a review from Copilot November 30, 2025 04:07
Copilot finished reviewing on behalf of yudataguy November 30, 2025 04:10
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 14 comments.

Comment on lines +629 to +640
void ModeManager ::handleWakeWindowTick() {
// Called from run_handler at 1Hz when in hibernation wake window
this->m_wakeWindowCounter++;

if (this->m_wakeWindowCounter >= this->m_wakeDurationSec) {
// Wake window elapsed, no EXIT_HIBERNATION received
// Go back to dormant sleep
this->m_inHibernationWakeWindow = false;
this->enterDormantSleep();
}
// Otherwise, continue listening for EXIT_HIBERNATION command
}
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment "// Called from run_handler at 1Hz when in hibernation wake window" is helpful, but it would be beneficial to document what happens when the wake window expires. Specifically, that enterDormantSleep() is called again, which will increment counters and potentially cause another reboot cycle.

Consider expanding the comment to: "// Called from run_handler at 1Hz when in hibernation wake window. When counter reaches wakeDurationSec, automatically re-enters dormant sleep for the next hibernation cycle."

Copilot uses AI. Check for mistakes.
Comment on lines +17 to +24
def pytest_addoption(parser):
"""Add custom command-line options for integration tests."""
parser.addoption(
"--run-reboot",
action="store_true",
default=False,
help="Run tests that trigger system reboot (hibernation tests)",
)
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring for pytest_addoption says "Add custom command-line options for integration tests" but doesn't explain what the --run-reboot flag does or when it should be used.

Consider expanding the docstring to:

"""Add custom command-line options for integration tests.

--run-reboot: Enable tests that trigger actual system reboots. These tests
are skipped by default because they require physical hardware and cause
the system to restart. Only use on real hardware for hibernation testing.
"""

Copilot uses AI. Check for mistakes.
Comment on lines 277 to 288
} else {
// Read failed or size mismatch (possibly old struct version)
// Initialize to safe defaults and configure hardware
this->m_mode = SystemMode::NORMAL;
this->m_safeModeEntryCount = 0;
this->m_payloadModeEntryCount = 0;
this->m_hibernationCycleCount = 0;
this->m_hibernationTotalSeconds = 0;
this->m_sleepDurationSec = 3600;
this->m_wakeDurationSec = 60;
this->turnOnComponents();
}
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the loadState() function, when the read fails or has a size mismatch (line 277), the code initializes to default values and calls turnOnComponents(). However, this could be problematic if the system was actually in HIBERNATION_MODE before the read failure.

If a read error occurs during hibernation wake window restoration, the system will:

  1. Default to NORMAL mode (line 280)
  2. Turn on all face switches (line 287)
  3. Not properly initialize hibernation state variables (m_inHibernationWakeWindow remains false from constructor)

This means that if the persistent state file becomes corrupted during a hibernation cycle, the system will incorrectly restore to NORMAL mode with all switches on, potentially violating power constraints that hibernation mode was trying to maintain.

Recommendation: Consider logging a critical warning when state restoration fails, and possibly defaulting to SAFE_MODE instead of NORMAL to maintain a conservative power profile.

Copilot uses AI. Check for mistakes.
@yudataguy yudataguy requested a review from Copilot November 30, 2025 04:37
Copilot finished reviewing on behalf of yudataguy November 30, 2025 04:42
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 22 comments.

// Enter dormant mode - processor will halt until AON timer alarm
static void go_dormant(void) {
// Enable deep sleep in the processor (Cortex-M33)
scb_hw->scr |= M33_SCR_SLEEPDEEP_BITS;
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The M33_SCR_SLEEPDEEP_BITS constant is used without being defined or included. This is likely a Cortex-M33 specific constant that should come from CMSIS headers. Ensure this constant is properly defined, or use the standard SCB_SCR_SLEEPDEEP_Msk from CMSIS core_cm33.h instead.

Copilot uses AI. Check for mistakes.
Comment on lines +250 to +260
if (this->m_mode == SystemMode::HIBERNATION_MODE) {
// Woke from dormant sleep - enter wake window
// Note: Hibernation counters (cycleCount, totalSeconds) were already
// incremented in enterDormantSleep() BEFORE the dormant sleep and saved
// to persistent storage. We're restoring those pre-incremented values.
// Keep all load switches OFF - we're in minimal power mode
this->turnOffNonCriticalComponents();

// Start wake window (radio is already initializing in Main.cpp)
// startWakeWindow() only logs an event and sets up state, no counter changes
this->startWakeWindow();
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment states "Hibernation counters (cycleCount, totalSeconds) were already incremented in enterDormantSleep() BEFORE the dormant sleep and saved to persistent storage. We're restoring those pre-incremented values." However, this is slightly misleading. The counters are incremented in enterDormantSleep(), but only if the system successfully enters dormant mode and reboots. If dormant entry fails, the counters are rolled back. Consider clarifying this comment to mention that these are the "successfully saved" values after dormant entry, not necessarily pre-incremented values that might have been rolled back.

Copilot uses AI. Check for mistakes.
Comment on lines 198 to 211
go_dormant();

// If we get here, we woke up successfully from dormant!
// Disable the alarm now that we're awake
aon_timer_disable_alarm();

// Restore clocks (partial - PLLs require full init)
sleep_power_up();

// Due to complexity of restoring full Zephyr clock state,
// we do a warm reboot to ensure clean state
// The state file still has HIBERNATION_MODE, so loadState() will
// detect this and start the wake window
sys_reboot(SYS_REBOOT_WARM);
Copy link

Copilot AI Nov 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code calls go_dormant() and then expects execution to continue at line 202 with aon_timer_disable_alarm(). However, according to the implementation at line 211, after waking from dormant, the code immediately calls sys_reboot(SYS_REBOOT_WARM) which never returns. This means lines 202-205 (aon_timer_disable_alarm() and sleep_power_up()) are unreachable code and will never execute.

Either remove these unreachable lines or restructure the code so that the alarm disable and clock restoration happen before the reboot decision is made.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

Copilot AI commented Nov 30, 2025

@yudataguy I've opened a new pull request, #150, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Contributor

Copilot AI commented Nov 30, 2025

@yudataguy I've opened a new pull request, #151, to work on those changes. Once the pull request is ready, I'll request review from you.

Copy link
Contributor

Copilot AI commented Nov 30, 2025

@yudataguy I've opened a new pull request, #152, to work on those changes. Once the pull request is ready, I'll request review from you.

@yudataguy
Copy link
Collaborator Author

Not really sure about the PICO SDK implementation of sleep timer in zephyr, i'll just leave as it is for now. Need more testing to make sure it works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants